ABSTRACT 


The lack of truly reliable data for climate change analysis and prediction presents challenges in 
climate modeling. Data must be hydrologically and statistically reliable to be useful for 
hydrological, meteorological, climate change, and estimation studies; data quality and homogeneity 
screening are therefore necessary preliminary analyses. In this study, the homogeneity of the 
climatic data used for analyses of climate variability in the coastal region of Nigeria was assessed. 
Climatic Research Unit (CRU, 0.5° x 0.5°) gridded monthly climatic data for sixty years (1956-2016) 
for nine states of the coastal region of Nigeria, obtained from internet sources, were validated 
against Nigerian Meteorological Agency (NiMet) data to confirm their adequacy for use. The data were 
tested for normality using the Shapiro-Wilk (S-W) test, the D’Agostino-Pearson omnibus test, and the 
skewness test. Four homogeneity tests were applied to 257 locations in the nine states: Pettitt’s 
test, the Standard Normal Homogeneity Test (SNHT), Buishand’s test, and the Von Neumann Ratio (VNR) 
test. The validation results indicated that the CRU data are very reliable, which justified their 
use in the further analyses carried out in the study. The results also indicated that the CRU 
climatic data series were normally distributed, so parametric methods could be used in further 
analysis of the data. Rainfall data were found to be homogeneous for Bayelsa, Delta, Edo, Lagos, 
Ogun, and Ondo States and inhomogeneous for Akwa Ibom, Cross River, and Rivers States. Temperature 
data were found to be inhomogeneous for all the states in the study area. 

1.0. Introduction 


Reliable and accurate estimates of climate are crucial not only for the study of climate variability 
but also for water resource management, agriculture, and weather, climate, and hydrological 
forecasting (Sarojini et al., 2016). Unfortunately, there is a dearth of satisfactory climatic data 
in developing countries. Rain-gauge measurement is the oldest and most traditional method for 
monitoring rainfall. However, because of practical observational limitations, gauge records often 
suffer from numerous gaps in space and time: weather stations are limited in number and often 
unevenly distributed, resulting in missing data, short periods of observation, incomplete areal 
coverage, and deficiencies over most oceanic and sparsely populated areas (Kidd et al., 2017). This 
makes gauge data less reliable for climate change diagnostic studies and creates problems in initial 
data processing and calibration when non-continuous rainfall and temperature data are fed into the 
Water Balance and TREND software, which often accept only continuous data of long duration (over 
fifty years). Unfortunately, the available NiMet data either have missing values or span fewer than 
50 years in some locations. This basic requirement is therefore often not met, and the situation is 
compounded by sparse data coverage in the region, where each state has only a few weather stations, 
most of which are located at airports. Studies have shown that a major constraint identified in past 
climate change research is the lack of long-term climatic data (Nnamchi et al., 2008). Mitchell and 
Jones (2005) recommended that a large proportion of such data needs can be met by providing a 
standard set of ‘climate grids’ of monthly variations over a century-long time scale on a regular 
high-resolution (0.5°) latitude-longitude grid. 
The CRU TS datasets have been applied extensively in studies of climatic variability, often after 
conversion into more commonly used formats that can be used directly in GIS. The application of the 
CRU dataset in long-term modeling of climatic variability is well documented. 


Data taken from observation stations should be tested for reliability and homogeneity before they 
are used in research studies. Homogeneity testing is one of the most important analyses in 
climate-related studies, as it underpins the reliability of any inferences (Yozgatligil and Yazici, 
2015). The accuracy and reliability of model results in studies of climate change, classification, 
flood and drought modeling, and water resources planning and modeling in hydrology and meteorology 
depend on the quality of the data used. Inhomogeneities may enter station records, making 
meteorological observations unreliable, because of changes in the observation method, the 
conditions around the station, the reliability of the measuring instrument, and similar factors. 


A homogeneous climate series may be defined as a series influenced only by variations in climate. 
However, it is difficult to find reference stations with a high correlation and a homogeneous 
structure over wide regions. For this reason, the absolute method was used for homogeneity testing 
in this study, owing to the high spatial variation among precipitation stations. The standard 
procedure is to apply four tests, the Pettitt, Standard Normal Homogeneity (SNHT), Buishand Range 
(BRT), and Von Neumann Ratio (VNRT) tests, at a significance level of 0.05 (Agha et al., 2017). The 
temperature series were tested using the annual mean temperature, while the precipitation series 
were tested using the annual mean rainfall. Using derived annual variables avoids the 
autocorrelation problems that arise when testing daily series. 


The SNHT, BRT, and Pettitt tests differ in their sensitivity. The SNHT tends to detect change points 
near the beginning and the end of a series, whereas the BRT and Pettitt tests are more sensitive to 
changes in the middle of a series (Martinez et al., 2009). These three tests can identify the year 
in which a break occurs. The VNRT assumes the same null hypothesis as the other three tests, but its 
alternative hypothesis is that the series is not randomly distributed. It therefore assesses the 
randomness of the series but gives no information about the year of the break. Homogeneity, or 
consistency, implies that all the collected hydrologic time series data belong to the same 
statistical population with a time-invariant mean. The tests used to check the homogeneity or 
consistency of a data series are therefore based on evaluating the significance of changes in the 
mean value. Climate data used for long-term climate analyses, particularly climate change analyses, 
must be homogeneous to give accurate results. A homogeneous climate time series is one in which 
variations are caused only by variations in weather and climate. Unfortunately, most long-term 
climatological time series have been affected by several non-climatic factors that make the data 
unrepresentative of the actual climate variations occurring over time. 


Climate data homogenization aims to adjust observations, where necessary, so that the temporal 
variations in the adjusted data are caused only by climate processes. 


2.0. Methodology 


2.1. Description of study area 


The study area is the coastal region of Nigeria, located between latitudes 4°N and 8°N and 
longitudes 3°E and 9°E. Figure 1 presents the map of Nigeria showing the coastal region, while the 
station coordinates of representative cities in the coastal region and other state details are 
presented in Table 1. The Nigerian coastline, which is about 853 km long, runs through nine states 
bordering the Atlantic Ocean: Lagos, Ogun, Ondo, Edo, Delta, Bayelsa, Rivers, Akwa Ibom, and Cross 
River. Nine stations, one from each of these states, were selected for representative coverage. The 
Nigerian coastal zone has a tropical climate with a rainy season (April to November) and a dry 
season (December to March). High temperatures and humidity, as well as marked wet and dry seasons, 
characterize the Nigerian climate. Annual rainfall in the coastal areas ranges between 1,500 and 
4,000 mm. The coastal area is humid, with a mean temperature of 24 to 32°C (Kuruk, 2004). It is low 
lying, with heights of not more than 3.0 m above sea level, and is generally covered by freshwater 
swamp, mangrove swamp, lagoon marshes, tidal channels, beach ridges, and sand bars (Nwilo and 
Badejo, 2006). 


Table 1: Station coordinates of representative cities in the coastal region of Nigeria and other 
state details 

S/N | Coastal state | Land area (km²) | Population density (persons/km²) | Selected city | Longitude | Latitude | Years of available NiMet data
1 | Akwa Ibom | 8,421 | 466 | Uyo | | | 60
2 | Bayelsa | 21,100 | 81 | Yenagoa | 6.26°E | 4.92°N | 60
3 | Cross River | 23,074 | 125 | Calabar | 8.20°E | | 
4 | Delta | | | Warri | | | 
5 | Edo | 15,650 | | | 5.31°E | 6.20°N | 
6 | Lagos | | | | | | 
7 | Ogun | 16,720 | 223 | Abeokuta | 3.35°E | 7.16°N | 60
8 | Ondo | 14,769 | 233 | Akure | | | 60
9 | Rivers | | | | | | 
Figure 1: Map of Nigeria showing the coastal region 
[Source: Adapted from Office of the Surveyor General of the Federation (OSGOF)] 


2.2. Data collection/validation 


The climatic data collected were mean monthly rainfall and temperature for a period of 60 years 
(1956-2017) for selected cities in the coastal region of Nigeria. Climatic Research Unit (CRU, 
0.5° x 0.5°) gridded monthly climatic data for two climatic periods (1956-1986 and 1987-2016), 
obtained from the internet (http://badc.nerc.ac.uk), were sorted into annual rainfall series and 
validated against Nigerian Meteorological Agency (NiMet) data obtained from the Central Bank of 
Nigeria (CBN) website (https://www.cbn.gov.ng/documents/Statbulletin.asp). The validation was 
carried out using the following goodness-of-fit statistics: the Coefficient of Determination (R²), 
the Nash-Sutcliffe Efficiency (NSE), and the ratio of the Root Mean Squared Error to the standard 
deviation of the observations (RSR). The data series from 257 locations in the nine states of the 
region were then tested for normality and homogeneity. A value of R² is considered very good when it 
lies within the range 0.75 < R² < 1 (see Table 2). The lack of truly reliable data complicates the 
analysis of climate trends and increases the challenges of related research; hence data validation 
was carried out. 
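
As a minimal sketch of how the three goodness-of-fit statistics named above can be computed for a 
pair of observed (NiMet) and gridded (CRU) series, consider the Python function below. The function 
name and the sample values are illustrative assumptions and are not the study's data; R² is taken 
here as the squared Pearson correlation, which is one common definition. 

```python
import numpy as np


def validation_metrics(observed, simulated):
    """Return (R2, NSE, RSR) for two paired series of equal length."""
    obs = np.asarray(observed, dtype=float)
    sim = np.asarray(simulated, dtype=float)

    # R2: squared Pearson correlation between the two series.
    r2 = np.corrcoef(obs, sim)[0, 1] ** 2

    # Sum of squared errors and sum of squares about the observed mean.
    sse = np.sum((obs - sim) ** 2)
    sst = np.sum((obs - obs.mean()) ** 2)

    nse = 1.0 - sse / sst       # Nash-Sutcliffe Efficiency
    rsr = np.sqrt(sse / sst)    # RMSE divided by the SD of the observations
    return float(r2), float(nse), float(rsr)


# Hypothetical annual rainfall totals (mm), for illustration only.
nimet = [2100.0, 2250.0, 1980.0, 2400.0, 2320.0]
cru = [2080.0, 2300.0, 2010.0, 2350.0, 2290.0]
r2, nse, rsr = validation_metrics(nimet, cru)
print(f"R2={r2:.3f}  NSE={nse:.3f}  RSR={rsr:.3f}")
```

The returned values can then be compared against the rating bands in Table 2. 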
Table 2: General validation rating used 

R² | RSR | K tau | Validation rating | MAPE (%) | Validation rating | Cp | Capability rating
0.75 < R² < 1 | 0 < RSR < 0.5 | 0.75 < K < 1 | Very good | 0 < MAPE < 10 | Highly accurate | Cp < 1 | Incapable
0.65 < R² < 0.75 | 0.5 < RSR < 0.6 | K < 0.75 | Good | | | 1 < Cp < 3 | Capable
0.5 < R² < 0.65 | 0.6 < RSR < 0.7 | | | 20 to 50 | Reasonable | 3 < Cp | Very capable
 | RSR > 0.7 | | Unsatisfactory | > 50 | Inaccurate | | 


Note that adjusted R² can take a negative value, unlike R², which always lies between 0 and 1. 
Adjusted R² is also roughly equal to R². An adjusted R² greater than 0.75 indicates very good 
accuracy. The interpretations of Kendall’s tau and Spearman’s rank correlation coefficient are very 
similar and thus usually lead to the same inferences. 





2.3. Preliminary data analysis 


2.3.1. Descriptive statistics 

The descriptive statistics of the annual rainfall time series, namely the minimum, maximum, range, 
mean, standard deviation, variance, standard error of the mean, kurtosis, and skewness, together 
with their standard errors, were computed using the XLSTAT software. The basic equations for the 
descriptive statistics are presented in Table 3. 


Table 3: Equations for descriptive statistics 

Statistic | Equation | Description
Mean | $\bar{R} = \frac{\sum R}{N}$ | Obtained by adding together all the variates, $\sum R$, and dividing by the total number of variates, N.
Standard deviation | $\sigma = \sqrt{\frac{\sum (R - \bar{R})^2}{N}}$ | The square root of the mean squared deviation of individual observations from their mean.
Coefficient of variation | $C_v = \frac{\sigma}{\bar{R}}$ | Obtained by dividing the standard deviation of the data set by its mean.
Determination of coefficient of variation | $CV = \frac{SD}{\bar{X}} \times 100$ | A measure of the variability of the data.
Skewness | $C_s = \frac{\frac{1}{N}\sum (R - \bar{R})^3}{\sigma^3}$ | R is a variate, $\bar{R}$ is the mean of the data set, and N is the total number of variates.
Kurtosis | $C_k = \frac{\frac{1}{N}\sum_{i=1}^{N} (R_i - \bar{R})^4}{\left(\frac{1}{N}\sum_{i=1}^{N} (R_i - \bar{R})^2\right)^2}$ | Kurtosis is the degree of peakedness of a distribution, usually taken relative to a normal distribution. A distribution with a relatively high peak is called leptokurtic, while a flat-topped one is called platykurtic. The normal distribution, which is neither very peaked nor very flat-topped, is called mesokurtic.
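
The moment-based quantities listed in Table 3 can be sketched in a few lines of Python. This is an 
illustration under stated assumptions rather than a reproduction of the XLSTAT output: the function 
name `describe` is hypothetical, the population (divide-by-N) forms are used throughout, and 
statistical packages may instead report sample (N-1) versions or excess kurtosis (kurtosis minus 3). 

```python
import numpy as np


def describe(series):
    """Moment-based descriptive statistics for a 1-D series."""
    r = np.asarray(series, dtype=float)
    mean = r.mean()
    dev = r - mean
    m2 = np.mean(dev ** 2)   # second central moment (population variance)
    m3 = np.mean(dev ** 3)   # third central moment
    m4 = np.mean(dev ** 4)   # fourth central moment
    sd = np.sqrt(m2)
    return {
        "min": float(r.min()),
        "max": float(r.max()),
        "range": float(r.max() - r.min()),
        "mean": float(mean),
        "sd": float(sd),
        "variance": float(m2),
        "cv_percent": float(sd / mean * 100.0),
        "skewness": float(m3 / m2 ** 1.5),
        "kurtosis": float(m4 / m2 ** 2),   # equals 3 for a normal distribution
    }


# Hypothetical values, for illustration only.
print(describe([2, 4, 4, 4, 5, 5, 7, 9]))
```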





2.3.2. Simple t-test 

This test is data-driven and requires a known change point. It was used to test the null hypothesis 
that the means in two different periods are equal (Mu et al., 2007). The simple t-test assumes that 
the data are normally distributed. The relationship is expressed as follows: 

$$t = \frac{\bar{x} - \bar{y}}{S\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \qquad (1)$$

$$S = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n - 2}} \qquad (2)$$

where $\bar{x}$ and $\bar{y}$ in Equation 1 are the means of the observed and simulated data 
respectively, $n_1$ and $n_2$ are the numbers of observations in the observed and simulated data 
respectively, $s_1^2$ and $s_2^2$ are the variances of the two groups, and S is the pooled standard 
deviation of the sample (of all $n_1$ and $n_2$ observations, $n = n_1 + n_2$). The t statistic and 
the degrees of freedom were used to determine the p-value. The p-value is the probability that a t 
statistic having n - 2 (22) degrees of freedom is more extreme than 1.725. Since this is a 
two-tailed test, "more extreme" means greater than 1.725 or less than -1.725. 
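
A pooled-variance two-sample t-test of this kind can be sketched as follows. The function name and 
the sample values are hypothetical, and the sketch assumes the standard pooled form with n - 2 
degrees of freedom; `scipy.stats.ttest_ind` with `equal_var=True` computes the same test and can be 
used as a cross-check. 

```python
import numpy as np
from scipy import stats


def pooled_t_test(x, y):
    """Two-sample t-test with pooled variance; returns (t, df, two-tailed p)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.size, y.size
    df = n1 + n2 - 2

    # Pooled variance from the two sample variances (ddof=1).
    s2 = ((n1 - 1) * x.var(ddof=1) + (n2 - 1) * y.var(ddof=1)) / df
    t = (x.mean() - y.mean()) / np.sqrt(s2 * (1.0 / n1 + 1.0 / n2))

    # Two-tailed p-value from the Student t distribution.
    p = 2.0 * stats.t.sf(abs(t), df)
    return float(t), df, float(p)


# Hypothetical period means, for illustration only.
t, df, p = pooled_t_test([1, 2, 3, 4, 5], [2, 3, 4, 5, 6])
print(f"t={t:.3f}, df={df}, p={p:.3f}")
```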


2.3.3. Test of normality 

Normality testing, which was used to determine whether the dataset could be described by a normal 
distribution, was carried out using the Shapiro-Wilk Test (SWT), the D’Agostino-Pearson Test, and 
the Skewness Test. This supported data screening, outlier identification, description, assumption 
checking, and the characterization of differences among sub-populations. 


The hypotheses that guide the decision on normality are stated as follows: 
Null hypothesis H0: Data follow a normal distribution 


Alternative hypothesis H1: Data do not follow a normal distribution 


The Shapiro-Wilk Test 

This is one of the most popular tests for diagnosing the normality assumption. It has good power 
properties and is based on the correlation between the given observations and their associated 
normal scores. The Shapiro-Wilk test statistic, derived by Shapiro and Wilk (1965), is: 

$$W = \frac{\left(\sum_{i=1}^{n} a_i y_{(i)}\right)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (3)$$

where $y_{(i)}$ is the i-th order statistic and $a_i$ is the i-th expected value of normalized order 
statistics. For independently and identically distributed observations, the values of $a_i$ can be 
obtained from the table presented by Shapiro and Wilk (1965) for sample sizes up to 50. W can be 
expressed as the square of the correlation coefficient between $a_i$ and $y_{(i)}$, so W is location 
and scale invariant and is always less than or equal to 1. In a plot of $y_{(i)}$ against $a_i$, an 
exact straight line would lead to W very close to 1, so if W is significantly less than 1, the 
hypothesis of normality is rejected. Although the Shapiro-Wilk W test is very popular, it depends on 
the availability of the values of $a_i$, and for large samples their computation may be much more 
complicated. 


D’Agostino-Pearson Omnibus Test 

An alternative test of the same nature for samples larger than 50 was designed by D’Agostino (1973). 
In general, skewness is measured to assess symmetry or asymmetry, and kurtosis is examined to 
evaluate the shape of the distribution. The D’Agostino-Pearson (1973) test is based on skewness and 
kurtosis tests, both of which are assessed through moments. The equation is given as: 

$$K^2 = Z^2(\sqrt{b_1}) + Z^2(b_2) \qquad (4)$$

where $Z(\sqrt{b_1})$ and $Z(b_2)$ are the normal approximations to $\sqrt{b_1}$ and $b_2$, the 
sample skewness and kurtosis respectively. This statistic follows a chi-squared distribution with 
two degrees of freedom if the population is normally distributed. A large value of $K^2$ leads to 
the rejection of the normality assumption. 


For the skewness test, the skewness coefficient of a time series X(t) is estimated as follows 
(Karamouz et al., 2003): 

$$\hat{S} = \frac{\frac{1}{N}\sum_{t=1}^{N} (X_t - \bar{X})^3}{\left[\frac{1}{N}\sum_{t=1}^{N} (X_t - \bar{X})^2\right]^{3/2}} \qquad (5)$$

where N is the number of sample data and $\bar{X}$ is the sample mean of the time series X(t). The 
skewness test is based on the fact that the skewness coefficient of a normal variable is zero. If 
the series is normally distributed, $\hat{S}$ is asymptotically normally distributed with a mean of 
zero and a variance of 6/N; hence, the (1-α) x 100% confidence limits on skewness are defined as 

$$\left[-Z_{1-\alpha/2}\sqrt{\frac{6}{N}},\; Z_{1-\alpha/2}\sqrt{\frac{6}{N}}\right] \qquad (6)$$

where $Z_{1-\alpha/2}$ is the (1-α/2) quantile of the standard normal distribution. Therefore, if 
$\hat{S}$ falls within the limits of Eq. 6, the hypothesis of normality cannot be rejected. The test 
is found to be reasonably accurate for N > 150. 
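
The three normality checks described in this section can be sketched with SciPy as follows. This is 
an illustration only: the function name and the synthetic sample are assumptions, 
`scipy.stats.shapiro` and `scipy.stats.normaltest` are used as stand-ins for the Shapiro-Wilk and 
D’Agostino-Pearson tests, and the skewness limits follow the normal approximation with variance 
6/N described above. 

```python
import numpy as np
from scipy import stats


def normality_checks(series, alpha=0.05):
    """Run Shapiro-Wilk, D'Agostino-Pearson and a skewness-limit check."""
    x = np.asarray(series, dtype=float)
    n = x.size

    w_stat, w_p = stats.shapiro(x)        # Shapiro-Wilk W and p-value
    k2_stat, k2_p = stats.normaltest(x)   # D'Agostino-Pearson K^2 and p-value

    skew = stats.skew(x)                  # moment-based skewness coefficient
    # Confidence limit on skewness under normality (variance 6/N).
    limit = stats.norm.ppf(1.0 - alpha / 2.0) * np.sqrt(6.0 / n)

    return {
        "shapiro_p": float(w_p),
        "dagostino_p": float(k2_p),
        "skewness": float(skew),
        "skew_limit": float(limit),
        "skew_normal": bool(abs(skew) <= limit),
    }


# Synthetic 60-value series, for illustration only.
rng = np.random.default_rng(42)
sample = rng.normal(loc=2000.0, scale=150.0, size=60)
print(normality_checks(sample))
```

For the first two tests, normality is not rejected when the p-value exceeds alpha; for the skewness 
check, it is not rejected when the coefficient lies within the limit. 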


The Shapiro-Wilk (S-W) test, D’Agostino-Pearson omnibus test, and skewness values are also 
displayed in the output of the descriptive statistics explained in the previous section, and the 
results are presented in Section 3.1.2. 


2.4. Data analysis 


2.4.1. Test for homogeneity 

Homogeneity tests help in assessing trend reliability and identifying suitable sub-periods for 
analysis. Homogeneity testing was carried out using Pettitt’s test, the Standard Normal Homogeneity 
Test (SNHT), Buishand’s test, and the Von Neumann Ratio test, all implemented in the XLSTAT 
software. These tests are formulated against the alternative hypothesis of a single shift. For all 
tests, p-values were calculated using Monte Carlo resampling. The following hypotheses guide the 
interpretation of the homogeneity results in the tables: 


Null hypothesis: H0: Data are homogeneous 
Alternative hypothesis: Ha: There is a date at which there is a change in the data 
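
How a resampling-based p-value can be obtained is sketched below. This is not the XLSTAT 
implementation: it assumes one common scheme, in which the series is randomly reordered many times 
and the p-value is the share of reordered series whose statistic is at least as large as the 
observed one. The function names, the example statistic, and the synthetic series are all 
hypothetical, and the scheme applies only to statistics for which large values indicate a break. 

```python
import numpy as np


def monte_carlo_p_value(series, statistic, n_sim=999, seed=0):
    """Estimate P(statistic >= observed) under random reordering of the series."""
    y = np.asarray(series, dtype=float)
    rng = np.random.default_rng(seed)
    observed = statistic(y)
    # Count shuffled series whose statistic is at least as large as observed.
    exceed = sum(statistic(rng.permutation(y)) >= observed for _ in range(n_sim))
    return (exceed + 1) / (n_sim + 1)


def max_abs_cumulative_deviation(y):
    """Example statistic: largest absolute cumulative deviation from the mean."""
    return float(np.max(np.abs(np.cumsum(y - y.mean()))))


# Synthetic series with a clear upward shift halfway through.
rng = np.random.default_rng(1)
shifted = np.concatenate([rng.normal(0.0, 1.0, 30), rng.normal(5.0, 1.0, 30)])
p = monte_carlo_p_value(shifted, max_abs_cumulative_deviation)
print(f"Monte Carlo p-value: {p:.3f}")
```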


Test interpretation: 

H0: Data are homogeneous 

Ha: There is a date at which there is a change in the data. When the p-value shows that the null 
hypothesis is rejected, we can conclude that there is a shift between two parts of the time series. 
The associated plot confirms this result. The SNHT (Standard Normal Homogeneity Test) is usually 
applied to a series of ratios that compare the observations with an average. The ratios are then 
standardized. The null hypothesis is H0: The obtained ratios follow an N(0,1) distribution. 


Since the p-value is very small, we reject the null hypothesis and thus conclude that there exists a 
shift in the time series. This result confirms the result of the first test. Buishand’s test can be 
used on variables following any type of distribution. It is based on the null hypothesis H0: The T 
variables follow one or more distributions that have the same mean. Since the p-value is very small, 
this hypothesis is rejected in favor of the alternative hypothesis that there exists a time t from 
which the variables change in mean. Finally, the von Neumann ratio is based on the sum of the 
squared differences between each pair of successive observations. The mean of this ratio is equal to 
2 when the mean of the time series is constant. The p-value is equal to 0.002, which leads us to 
reject the null hypothesis of homogeneity of the time series. This also confirms the preceding 
results. The von Neumann test does not determine the time of change. 


When p is smaller than the specified significance level, e.g. 0.05, the null hypothesis is rejected. 
In other words, a significant change point exists, and the time series is divided into two parts at 
the location of the change point. For all four tests, if the test statistic exceeds the critical 
value at a given confidence level, the null hypothesis is rejected at that confidence level. 


The statistical analysis of any climatic time series should always examine important time series 
characteristics such as normality, homogeneity, seasonality, and the presence of trends and changes. 


Agbonaye and Izinyon, 2021 


Nigerian Journal of Environmental Sciences and Technology (NIJEST) Vol 5, No. 1 March 2021, pp 76 - 90 


2.4.2. Pettitt’s test (Non-Parametric Rank Test) 

This test, developed by Pettitt, is a nonparametric test that is useful for evaluating the 
occurrence of abrupt changes in climatic records (Smadi and Zghoul, 2006) and requires no 
assumption about the distribution of the data. It is an adaptation of the rank-based Mann-Whitney 
test that allows the time at which the shift occurs to be identified. The null and alternative 
hypotheses are the same as in the Buishand test, and the test is more sensitive to breaks in the 
middle of the time series (Wijngaard et al., 2003; Costa and Soares, 2009). The ranks 
$r_1, \ldots, r_n$ of the $Y_1, \ldots, Y_n$ are used to calculate the statistic: 

$$X_k = 2\sum_{i=1}^{k} r_i - k(n+1), \qquad k = 1, 2, \ldots, n \qquad (7)$$

If a break occurs in year K, then the statistic is maximal or minimal near the year k = K: 

$$X_E = \max_{1 \le k \le n} |X_k| \qquad (8)$$
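
The rank-based statistic of Equation 7 and its largest absolute value can be sketched in Python as 
follows. The function name and the short example series are hypothetical, and ties are given average 
ranks through `scipy.stats.rankdata`, which is an assumption the text does not address. 

```python
import numpy as np
from scipy.stats import rankdata


def pettitt_statistic(series):
    """Return (X_k for k=1..n, max |X_k|, position k of the maximum)."""
    y = np.asarray(series, dtype=float)
    n = y.size
    ranks = rankdata(y)                          # ranks r_1..r_n (average ranks for ties)
    k = np.arange(1, n + 1)
    x_k = 2.0 * np.cumsum(ranks) - k * (n + 1)   # Equation 7
    idx = int(np.argmax(np.abs(x_k)))
    return x_k, float(abs(x_k[idx])), idx + 1


# Hypothetical series with a jump after the third value.
x_k, x_max, break_at = pettitt_statistic([1, 2, 3, 10, 11, 12])
print(x_k, x_max, break_at)
```

In this example the statistic is most extreme at k = 3, the last value before the jump. 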





2.4.3. Standard Normal Homogeneity Test (SNHT) 

SNHT is one of the most popular homogeneity tests in climate studies. The null and alternative 
hypotheses in this test are the same as in the Buishand test; however, unlike the Buishand test, 
SNHT is more sensitive to breaks near the beginning and the end of the series (Costa and Soares, 
2009). Alexandersson and Moberg (1997) proposed a statistic T(k) to compare the mean of the first k 
years of the record with that of the last (n - k) years: 

$$T(k) = k\bar{z}_1^2 + (n - k)\bar{z}_2^2, \qquad k = 1, 2, \ldots, n \qquad (9)$$

$$\bar{z}_1 = \frac{1}{k}\sum_{i=1}^{k}\frac{Y_i - \bar{Y}}{s}, \qquad \bar{z}_2 = \frac{1}{n-k}\sum_{i=k+1}^{n}\frac{Y_i - \bar{Y}}{s}, \qquad s^2 = \frac{1}{n}\sum_{i=1}^{n}(Y_i - \bar{Y})^2 \qquad (10)$$

If a break is located at the year K, then T(k) reaches a maximum near the year k = K. The test 
statistic $T_0$ is defined as: 

$$T_0 = \max_{1 \le k < n} T(k) \qquad (11)$$

The null hypothesis is rejected if $T_0$ is above a certain level, which depends on the sample size. 
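
A minimal sketch of the SNHT statistic follows. It assumes the series is standardized with its own 
mean and population standard deviation and evaluates T(k) for k = 1 to n - 1, so that both segments 
contain at least one value; the function name and the example series are hypothetical. 

```python
import numpy as np


def snht_statistic(series):
    """Return (T(k) for k=1..n-1, T0 = max T(k), position k of the maximum)."""
    y = np.asarray(series, dtype=float)
    n = y.size
    z = (y - y.mean()) / y.std()        # standardized series (population SD)
    k = np.arange(1, n)                 # candidate break positions 1..n-1
    head = np.cumsum(z)[:-1]            # sum of z_1..z_k
    z1 = head / k                       # mean of the first k values
    z2 = (z.sum() - head) / (n - k)     # mean of the last n-k values
    t_k = k * z1 ** 2 + (n - k) * z2 ** 2
    idx = int(np.argmax(t_k))
    return t_k, float(t_k[idx]), idx + 1


# Hypothetical series with a step change after the third value.
t_k, t0, break_at = snht_statistic([0, 0, 0, 1, 1, 1])
print(t_k, t0, break_at)
```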


2.4.4. Buishand’s test (Parametric test) 

This test is used to detect a change in the mean by studying the cumulative deviations from the 
mean; it is based on the adjusted partial sums. According to Al-Ghazali and Alawadi (2014), the null 
hypothesis is that the data are homogeneous and normally distributed, and the alternative hypothesis 
is that there is a date at which a change in the mean occurs. 


This test supposes that the tested values are independent and identically normally distributed 
(null hypothesis). The alternative hypothesis assumes that the series has a jump-like shift (break). 
The test is more sensitive to breaks in the middle of the time series (Costa and Soares, 2009). The 
test statistics, which are the adjusted partial sums (Buishand, 1982), are defined as $S_0^* = 0$ 
and 

$$S_k^* = \sum_{i=1}^{k} (Y_i - \bar{Y}), \qquad k = 1, 2, \ldots, n \qquad (12)$$

When a series is homogeneous, the values of $S_k^*$ fluctuate around zero because no systematic 
deviations of the $Y_i$ values from their mean appear. Q-statistic: if a break is present in year K, 
then $S_k^*$ reaches a maximum (negative shift) or a minimum (positive shift) near the year k = K. 

$$Q = \max_{0 \le k \le n} |S_k^*| \qquad (13)$$

The R-statistic (range statistic) is 

$$R = \max_{0 \le k \le n} S_k^* - \min_{0 \le k \le n} S_k^* \qquad (14)$$

Buishand (1982) gives critical values for Q and R for different data set lengths. 
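
The cumulative-deviation statistics of this test can be sketched as follows. The sketch assumes the 
unscaled partial sums of deviations from the mean; published critical values are usually tabulated 
for rescaled versions (divided by the sample standard deviation and the square root of n), so the 
raw values printed here are illustrative only. The function name and the example series are 
hypothetical. 

```python
import numpy as np


def buishand_statistics(series):
    """Return (S_0..S_n, Q, R, position k of the largest |S_k|)."""
    y = np.asarray(series, dtype=float)
    # Partial sums of deviations from the mean, with S_0 = 0 prepended.
    s_k = np.concatenate(([0.0], np.cumsum(y - y.mean())))
    q = float(np.max(np.abs(s_k)))       # Q statistic
    r = float(s_k.max() - s_k.min())     # R (range) statistic
    k_star = int(np.argmax(np.abs(s_k)))
    return s_k, q, r, k_star


# Hypothetical series with an upward shift after the third value.
s_k, q, r, k_star = buishand_statistics([0, 0, 0, 1, 1, 1])
print(s_k, q, r, k_star)
```

The partial sums reach their minimum at k = 3 in this example, in line with the behavior described 
for a positive shift. 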


2.4.5. Von Neumann Ratio Test (Non-Parametric Test) 

Von Neumann proposed a nonparametric test in which the statistic is defined as the ratio of the mean 
square successive (year-to-year) difference to the variance. The null hypothesis is that the data 
are independent, identically distributed random quantities, and the alternative is that the time 
series is not randomly distributed. Under the null hypothesis of a constant mean, the expected value 
of the test statistic is equal to two. The von Neumann ratio test is not location-specific, which 
means that it gives no information about the date of the break. 


The von Neumann ratio N is defined as: 

$$N = \frac{\sum_{i=1}^{n-1} (Y_i - Y_{i+1})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} \qquad (15)$$

Here, as in each of the test descriptions, n is the data set length, $Y_i$ is the i-th element of 
the data set, and $\bar{Y}$ is the mean value of the data set. When the sample is homogeneous, the 
expected value is N = 2. If the sample contains a break, the value of N tends to be lower than this 
expected value. If the sample has rapid variations in the mean, values of N may rise above two 
(Klein, 2007). Critical values for N depend on the data set length. 
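
The ratio of summed squared successive differences to summed squared deviations from the mean can 
be computed directly, as in the sketch below. The function name and the two short example series 
are hypothetical. 

```python
import numpy as np


def von_neumann_ratio(series):
    """Ratio of summed squared successive differences to summed squared deviations."""
    y = np.asarray(series, dtype=float)
    numerator = np.sum(np.diff(y) ** 2)          # successive (year-to-year) differences
    denominator = np.sum((y - y.mean()) ** 2)    # deviations from the series mean
    return float(numerator / denominator)


# A series with a single step gives a ratio well below 2 ...
print(von_neumann_ratio([0, 0, 0, 1, 1, 1]))
# ... while a rapidly alternating series gives a ratio above 2.
print(von_neumann_ratio([1, -1, 1, -1]))
```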


XLSTAT statistical software, which uses a hypothesis-testing approach, was used to carry out the 
homogeneity analysis of the rainfall data. The results were categorized into three classes, useful, 
doubtful, and suspect, according to the number of tests rejecting the null hypothesis. At a 
significance level (alpha) of 0.05, corresponding to a 95% confidence level, a series was considered 
homogeneous when its p-value was larger than alpha. 
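
The three-class labelling can be sketched as below. The thresholds are an assumption, since the text 
does not state them: the sketch follows the convention often attributed to Wijngaard et al. (2003), 
in which a series is useful when at most one test rejects the null hypothesis, doubtful when two 
tests reject it, and suspect when three or four do. The function name and the p-values are 
hypothetical. 

```python
def classify_series(p_values, alpha=0.05):
    """Label a series from the number of homogeneity tests rejecting H0."""
    rejections = sum(1 for p in p_values.values() if p < alpha)
    if rejections <= 1:
        label = "useful"
    elif rejections == 2:
        label = "doubtful"
    else:
        label = "suspect"
    return rejections, label


# Hypothetical p-values for one station, for illustration only.
example = {"pettitt": 0.21, "snht": 0.03, "buishand": 0.40, "von_neumann": 0.65}
print(classify_series(example))
```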


3.0. Results and Discussion 


3.1. Test result for reliability of CRU data (preliminary data validation) 


The descriptive statistics of NiMet and CRU mean annual rainfall and temperature are presented in 
Tables 4 and 5. Table 4 shows that the mean annual rainfall ranges from 1263.575 mm in Lagos to 
3074.098 mm in Bayelsa. The standard deviation varies from 96.9369 mm to 160.7266 mm, while the 
skewness and kurtosis vary from -0.01484 to 0.06751 and from -1.61271 to -0.57684 respectively. 
These values of skewness and kurtosis indicate that the rainfall series approximate a normal 
distribution. 
Table 4: Descriptive statistics of NiMet and CRU mean annual rainfall (1956-1986 and 1987-2016) 

Climatic station | Climatic data | Mean (mm) | SD (mm) | Variance | Min (mm) | Max (mm) | Range (mm) | Sum (mm) | Skewness | Kurtosis
Ondo | NiMet | 119.3667 | 67.100 | 4502.435 | 0 | 221.5 | 221.5 | 1313.03 | -0.50953 | -0.46507


Table 5: Descriptive statistics of NiMet and CRU mean annual temperature (1956-1986 and 1987-2016) 

Climatic station | Climatic data | Mean (°C) | SD (°C) | Variance | Min (°C) | Max (°C) | Range (°C) | Sum (°C) | Skewness | Kurtosis
Akwa Ibom | | 30.569 | | | | 32.333 | | 366.83 | | 
 | CRU | 30.597 | 1.629 | 2.6558 | 28.046 | 32.490 | 4.44 | 367.162 | -0.43246 | -1.4675




The descriptive statistics of mean annual temperature (1956-2016) in the coastal region of Nigeria 
are presented in Table 5. The table was obtained using the XLSTAT software as explained in Section 
2.3.1. The table shows that the annual temperature sum ranges from 293.3°C in Bayelsa to 369.73°C in 
Ogun. The standard deviation varies from 1.3242°C to 1.856°C, while the skewness and kurtosis vary 
from -0.45921 to -0.32776 and from -1.4675 to -0.85852 respectively. These values of skewness and 
kurtosis indicate that the temperature series approximate a normal distribution. 


3.2. Result for differences in mean annual rainfall during the two climatic periods (simple t-Test) 


The results obtained from the computer output are presented in Table 6. The table gives the means 
for the two climatic periods in each state, to which the Student t-test was applied to establish 
whether they differ. The Pearson correlation, the number of observations (n), and the degrees of 
freedom (df = n-1) are shown. The computed t statistics, as well as the critical t values for the 
one-tail and two-tail tests, are also presented. 


The hypotheses are stated as follows: 

The null hypothesis is that there is no statistical difference in mean seasonal rainfall 
distribution between the NiMet and CRU data. The alternative hypothesis is that there is a 
statistical difference in mean seasonal rainfall distribution between the NiMet and CRU data: 


Null hypothesis H0: μ1 = μ2 
Alternative hypothesis Ha: μ1 ≠ μ2, i.e. μ1 > μ2 or μ1 < μ2. 
Hence, two-tail tests are used for the analysis. Based on the t statistic and the degrees of 
freedom, we determine the p-value. The p-value is the probability that a t statistic having 11 
degrees of freedom is more extreme than 1.725. Since this is a two-tailed test, "more extreme" means 
greater than 1.725 or less than -1.725; the null hypothesis is retained when -1.725 < t < 1.725. 


Table 6: Results of simple t-test on comparison of NiMet and CRU data for rainfall 


Computed (n-2) (Two tail) 





Hence, there is no statistically significant difference in mean seasonal rainfall distribution 
between the NiMet and CRU data at a significance level of 0.05. 


To determine the suitability and adequacy of the CRU rainfall and temperature data for sixty-one 
years (1956-2017) for further analysis, the data were further validated against observed NiMet data 
covering the same period by comparing their descriptive statistics using the simple t-test and the 
following goodness-of-fit criteria: the Coefficient of Determination (R²), adjusted R², the process 
capability index (Cp), Kendall’s tau (K tau), the ratio of the Root Mean Squared Error to the 
standard deviation of the observations (RSR), and the Mean Absolute Percentage Error (MAPE). The 
results are shown in Tables 7 and 8 for rainfall and temperature respectively. 


Table 7: Comparison between NiMet and CRU data for rainfall 

State | R² | Rating | Adjusted R² | Rating | K tau | Rating | RSR | Rating | MAPE (%) | Rating | Cp | Rating
Delta | 0.865 | V.good | 0.83 | V.good | 0.818 | V.good | 0.41 | V.good | 66 | I | 2 | OK
Edo | 0.890 | V.good | 0.867 | V.good | 0.709 | V.good | 0.369 | V.good | 23.48 | R | 2 | OK
Lagos | 0.774 | V.good | 0.709 | V.good | 0.709 | V.good | 0.539 | Good | 44.79 | R | 2 | OK
Ogun | 0.888 | V.good | 0.850 | V.good | 0.745 | V.good | 0.381 | V.good | 50 | R | 2 | OK

R = Reasonable, I = Inaccurate, OK = capable or Reliable 





Table 8: Comparison between NiMet and CRU data for temperature 

State | R² | Rating | Adjusted R² | Rating | K tau | Rating | RSR | Rating | MAPE (%) | Rating | Cp | Rating
Bayelsa | 0.954 | V.good | 0.949 | V.good | 0.891 | V.good | 0.225 | V.good | 0.588 | H | 2 | OK
Delta | 0.949 | V.good | 0.94 | V.good | 0.891 | V.good | 0.227 | V.good | 0.659 | H | 2 | OK

H = Highly Reliable, OK = capable or Reliable 
Tables 7 and 8 show the values of R² obtained from the comparison. They range from 0.774 (Lagos) to 
0.935 (Cross River) for rainfall and from 0.883 (Ondo) to 0.976 (Edo) for temperature. Following 
Table 2, the performance rating of R² is very good, since 0.75 < R² < 1. With a minimum value of 
0.77788 for both rainfall and temperature, the performance of R² in validating the CRU data is very 
good. 
The results of this validity analysis indicate that the CRU data are very reliable, which justifies 
their use in the further analyses carried out in the study. 


An RMSE close to zero and an R² value approaching 1 indicate close agreement between 
observed and predicted values. As a rule of thumb, RMSE values between 0.2 and 0.5 show that the 
model can predict the data relatively accurately. 
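
The agreement measures discussed above can be illustrated with a minimal sketch. It assumes R² is taken as the squared Pearson correlation between the observed and gridded series, which may differ from the exact definition used in the study. The function names and the four-value sample series are hypothetical and are not taken from the NIMET or CRU records.

```python
import math


def r_squared(observed, estimated):
    """Squared Pearson correlation between two equal-length series (assumed definition)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov ** 2 / (var_o * var_e)


def rmse(observed, estimated):
    """Root mean squared error between observed and estimated values."""
    n = len(observed)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, estimated)) / n)


def mape(observed, estimated):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    n = len(observed)
    return 100.0 * sum(abs((o - e) / o) for o, e in zip(observed, estimated)) / n


# Hypothetical monthly totals, for illustration only.
station = [100.0, 200.0, 300.0, 400.0]
gridded = [110.0, 190.0, 310.0, 390.0]

print(r_squared(station, gridded), rmse(station, gridded), mape(station, gridded))
```

An R² near 1 together with a small RMSE and MAPE would correspond to the "very good" and "reasonable" ratings reported in Tables 7 and 8.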


Cp, the process capability index (referred to in this study as Mallows' Cp), shows whether the 
distribution can potentially fit inside the specification. Cp is an index used to assess the width of the 
process spread in comparison to the width of the specification. It is calculated by dividing the 
allowable spread by the actual spread. The allowable spread is the difference between the upper and 
lower specification limits. The actual spread is 6 times the estimated standard deviation. 


$$C_p = \frac{USL - LSL}{6\sigma} \qquad (16)$$
where USL and LSL are the upper and lower specification limits and σ is the estimated standard deviation. 
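
As a quick worked instance of Equation (16), the sketch below evaluates the index for a set of hypothetical specification limits. The function name and the numeric inputs are assumptions chosen for illustration; they are not values from the study.

```python
def process_capability(usl, lsl, sigma):
    """Allowable spread (USL - LSL) divided by the actual spread (6 * sigma)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (usl - lsl) / (6.0 * sigma)


# Hypothetical limits: an allowable spread of 6 units and a standard deviation of 0.5.
cp = process_capability(usl=30.0, lsl=24.0, sigma=0.5)
print(cp)
```

With these inputs the allowable spread is twice the actual spread, so the index equals 2, the value rated "OK" in Tables 7 and 8.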


3.3. Normality tests 


Three common normality tests were carried out, namely the Shapiro-Wilk Test (SWT), the 
D'Agostino-Pearson Test, and the Skewness Test. The results are presented in Tables 9 and 10. The 
tables show the test statistics, the p-values, and the significance level alpha = 0.05, and confirm 
normality with the entry YES. To determine whether the data depart from a normal distribution, the 
p-value is compared with the significance level. A significance level (denoted α or alpha) of 0.05 is 
customary; it means that the risk of concluding that the data do not follow a normal distribution, 
when they actually do, is 5%. When the p-value is larger than the significance level, the decision is 
to fail to reject the null hypothesis, because there is not enough evidence to conclude that the data do 
not follow a normal distribution. The results indicate that the three normality tests agree that the 
rainfall series follow a normal distribution. The normality assumption is crucial for the reliability of 
results, especially for parametric tests. 
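
The decision rule described above can be sketched as follows. The skewness here is the moment coefficient m3 / m2^1.5, which is an assumption, since the study does not state which estimator it used. The 0.59 bound comes from the captions of Tables 9 and 10, and the example values are those of the Cross River row in Table 9. The function names are hypothetical.

```python
ALPHA = 0.05       # significance level used in the study
SKEW_LIMIT = 0.59  # acceptance bound quoted in the captions of Tables 9 and 10


def skewness(values):
    """Moment coefficient of skewness, m3 / m2**1.5 (assumed estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def normal_by_p_value(p_value, alpha=ALPHA):
    """Fail to reject normality when the p-value exceeds alpha."""
    return p_value > alpha


def normal_by_skewness(s, limit=SKEW_LIMIT):
    """Accept normality when the skewness lies strictly inside (-limit, limit)."""
    return -limit < s < limit


# Cross River rainfall row of Table 9: SWT p-value, D'Agostino-Pearson p-value, skewness.
checks = [normal_by_p_value(0.33758), normal_by_p_value(0.2955), normal_by_skewness(-0.0152)]
print("YES" if all(checks) else "NO")
```

All three checks pass for this row, which matches the YES entries reported in the table.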


Table 9: Results of test for normality of spatial rainfall data (normality acceptable if -0.59 < s < 0.59) 


Location | SWT W stat | P-value | Alpha | Normal | DA stat | P-value | Alpha | Normal | Skewness (s) | Alpha | Normal
Cross River | 0.92222 | 0.33758 | 0.05 | YES | 2.438 | 0.2955 | 0.05 | YES | -0.0152 | 0.05 | YES





Table 10: Results of test for normality of spatial temperature data (normality acceptable if -0.59 < s < 0.59) 


Location | SWT W stat | P-value | Alpha | Normal (p>α) | DA stat | P-value | Alpha | Normal (p>α) | Skewness (s) | Alpha | Normal
Cross River | 0.95524 | 0.71132 | 0.05 | YES | 0.9932 | 0.6086 | 0.05 | YES | -0.33311 | 0.05 | YES


The five tests scored H0 for Edo, Lagos, Ogun, and Ondo States rainfall data and Ha for Cross River 
and Rivers States rainfall data. Three tests scored Ha and two tests scored H0 for Akwa Ibom rainfall 
data, so the overall result is taken as Ha. Also, three tests scored H0 and two tests scored Ha 
for Bayelsa and Delta States rainfall data, so the overall result is taken as H0. 
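
The scoring described above amounts to a majority rule over the individual tests, which can be sketched as below. This is an interpretation of the "average score" wording rather than a procedure stated in the study, and the function name is hypothetical. The first list uses the Rivers State p-values reported in Tables 11a and 11b; the second list is purely illustrative.

```python
def overall_result(p_values, alpha=0.05):
    """Return 'Ha' when more than half of the tests reject H0, otherwise 'H0'."""
    rejections = sum(1 for p in p_values if p < alpha)
    return "Ha" if rejections > len(p_values) / 2 else "H0"


# Rivers State rainfall p-values as reported in Tables 11a and 11b.
rivers = [0.00, 0.002, 0.002, 0.011, 0.00]
print(overall_result(rivers))

# Illustrative split case: two rejections out of five tests.
print(overall_result([0.01, 0.02, 0.32, 0.618, 0.40]))
```

A three-to-two split therefore resolves to whichever hypothesis the three tests support, as in the Akwa Ibom, Bayelsa, and Delta cases.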


Table 11a: Summary of homogeneity test for rainfall data (Pettitt's test and SNHT) 


State | Pettitt statistic | Year | p-value | Result | SNHT statistic | Year | p-value | Result
Cross River | 18 | 198 | 0.00 | Ha | 21.44 | 1984 | 0.000 | Ha
Edo | 149 | 2011 | 0.32 | H0 | 3.429 | 2011 | 0.618 | H0
Rivers | 466 | 1980 | 0.00 | Ha | 14.12 | 1969 | 0.002 | Ha


Table 11b: Summary of homogeneity test for rainfall data (Buishand and Von Neumann tests) 


State | Buishand Q-value | Year | p-value | Result | Buishand R-value | p-value | Result | Von Neumann N | p-value | Result
Bayelsa | 6.616 | 2011 | 0.369 | H0 | 8.227 | 0.547 | H0 | | |
Rivers | 13.33 | 1980 | 0.002 | Ha | 13.899 | 0.011 | Ha | 1.124 | 0.00 | Ha


Table 12a: Summary of homogeneity test for temperature data 


Location 


Table 12b: Summary of homogeneity test for temperature data 


Location | Q-Value | Year | Trend | R-Value | Trend | N


Test interpretations of rainfall data 


H0: Data are homogeneous. 
Ha: There is a date at which there is a change in the data. 

Where the computed p-value is greater than the significance level alpha = 0.05, one cannot reject the 
null hypothesis H0, and the data are treated as homogeneous. Where the computed p-value is lower 
than the significance level alpha = 0.05, one should reject the null hypothesis H0 and accept the 
alternative hypothesis Ha that the data are inhomogeneous. 


3.3.1. Test interpretations of temperature data 

The five tests scored Ha for all the States in the study area for temperature data. 

H0: Data are homogeneous. 
Ha: There is a date at which there is a change in the data. 

As the computed p-value (<0.0001) is lower than the significance level alpha = 0.05, one should reject 
the null hypothesis H0 and accept the alternative hypothesis Ha that the data are not homogeneous. 
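
To make the change-point logic concrete, the following is a minimal sketch of Pettitt's test in its standard textbook form, using the usual approximate p-value 2·exp(-6K² / (n³ + n²)). It is not necessarily the exact software routine used in the study, and the function names and the synthetic series are assumptions for illustration only.

```python
import math


def _sign(value):
    """Return -1, 0 or 1 according to the sign of value."""
    return (value > 0) - (value < 0)


def pettitt_test(series):
    """Return (K, t, p): the statistic, the number of points before the break, and the p-value."""
    n = len(series)
    best_k, best_t = 0, 0
    for t in range(1, n):
        # U_t compares every value in the first t points with every later value.
        u = sum(_sign(series[i] - series[j]) for i in range(t) for j in range(t, n))
        if abs(u) > best_k:
            best_k, best_t = abs(u), t
    p_value = min(1.0, 2.0 * math.exp(-6.0 * best_k ** 2 / (n ** 3 + n ** 2)))
    return best_k, best_t, p_value


# Synthetic series with an abrupt shift after the tenth value (not study data).
shifted = [1.0] * 10 + [5.0] * 10
k, t, p = pettitt_test(shifted)
print(k, t, p < 0.05)
```

For the synthetic series the break is located after the tenth observation and the p-value falls below 0.05, so H0 would be rejected in favour of Ha, mirroring the interpretation given above.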


CRU TS is not specifically homogeneous. Some National Meteorological Agencies (NMAs) 
homogenize their station observations, either before release or at a later stage (requiring a re-release). 
Therefore, many CRU TS observations have been homogenized (and also quality controlled) within 
each country. However, performing additional homogenization on the CRU TS databases would be 
complicated and not completely possible because of elements of the process, such as partly synthetic 
variables and the use of published climatology. The results of the homogeneity tests are 
presented in Figures 2a to 2j and Table 13. 

Figure 2c: Edo State rainfall Pettitt test 
Figure 2d: Edo State temperature Pettitt test 
Figure 2e: Cross River State rainfall Pettitt test 
Figure 2f: Cross River State temperature Pettitt test 
Figure 2g: Delta State rainfall SNHT test 
Figure 2h: Delta State temperature SNHT test 
Figure 2i: Akwa Ibom State rainfall Buishand's test 
Figure 2j: Akwa Ibom State temperature Buishand's test 

Table 13: Summary of abrupt change results obtained from homogeneity tests (Figures 2a to 2j) 





1980 1980 


4.0. Conclusion 


The five homogeneity tests conducted indicated that CRU rainfall data for Bayelsa, Delta, Edo, Lagos, 
Ogun, and Ondo States are homogeneous, while those for Akwa Ibom, Cross River, and Rivers States 
are inhomogeneous. The five tests also indicated that CRU temperature data for all the States in the 
study area are inhomogeneous. 


National Meteorological Agencies (NMAs) homogenize their station observations, either before 
release or at a later stage (requiring a re-release). Therefore, many CRU TS observations have been 
homogenized (and also quality controlled) within each country. Hence, the inhomogeneity observed 
may not be due to data quality but to climate variability and climate change. For temperature, the 
significant abrupt change most likely occurred in 1980, where the breakpoints were detected. The 
study also indicated that 1980 was the driest year in Lagos State. 